Airport Network Analysis

Author

Farhan

Ego Network

Code
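# Packages used throughout (assumed setup; the chunk that loads them is not shown)
library(tidyverse)  # readr, purrr, dplyr, tidyr
library(igraph)     # loaded after tidyverse so simplify() resolves to igraph, not purrr
library(ggraph)
# Reading the ego network dataset
ego_net_link <- "https://notes.farhansadeek.com/dartmouth/math7/homework/Ego_Network.csv"
ego <- read.csv(ego_net_link)
ego_network <- graph_from_data_frame(ego, directed = FALSE)
plot(ego_network, vertex.size = 5, vertex.label = NA)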

Flight Network

First, we need to read the dataset. Because it is large, it is split across multiple CSV files, so we have to read and combine all of them. We start by listing the files in the directory to make sure they are there.

Code
## Listing all the files in the 'dataset' directory
files <- list.files(path = "dataset", pattern = "flightlist_.*\\.csv$", full.names = TRUE)
files

Perfect, all the files exist in that directory. Now we will read them in order and combine them into one massive data frame.

Code
df <- files |> map_df(~read_csv(.))
Rows: 2152157 Columns: 15
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (6): callsign, number, aircraft_uid, typecode, origin, destination
dbl  (6): latitude_1, longitude_1, altitude_1, latitude_2, longitude_2, alti...
dttm (3): firstseen, lastseen, day

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 842905 Columns: 15
Rows: 1088267 Columns: 15
Rows: 1444224 Columns: 15
Rows: 1905528 Columns: 15
Rows: 2042040 Columns: 15
Rows: 1930868 Columns: 16
chr  (7): callsign, number, icao24, registration, typecode, origin, destination
Rows: 1985145 Columns: 15
Rows: 1825015 Columns: 15
Rows: 1894751 Columns: 15
Rows: 1783384 Columns: 15
Rows: 1617845 Columns: 15
Rows: 2079436 Columns: 15
Rows: 2227362 Columns: 15
Rows: 2278298 Columns: 15
Rows: 2540487 Columns: 15
Rows: 2840201 Columns: 15
Rows: 2794400 Columns: 15
Rows: 2523676 Columns: 15
Rows: 2726252 Columns: 15
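
One of the files reports 16 columns (icao24 and registration instead of aircraft_uid). Since map_df() row-binds its results with dplyr::bind_rows(), the combined data frame should simply carry NA where a file lacks a column. The sketch below illustrates that behaviour on two tiny made-up tables; the values are hypothetical, and only the column names come from the messages above.

```r
library(dplyr)

# Toy stand-ins for two files with slightly different schemas
# (hypothetical values; column names taken from the read_csv messages).
file_a <- tibble(callsign = "AAL1", aircraft_uid = "uid-001", origin = "KIAD")
file_b <- tibble(callsign = "UAL2", icao24 = "a1b2c3",
                 registration = "N12345", origin = "KLAX")

# bind_rows() matches columns by name and fills the gaps with NA
combined <- bind_rows(file_a, file_b)

print(combined)
print(colSums(is.na(combined)))  # NA count per column
```

Because origin, destination, and the coordinate columns appear in every file, this mismatch should not affect the columns used later in the analysis.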

Since we just read a massive amount of data, let’s check how much memory it takes up in RAM.

Code
format(object.size(df), units = "Gb")
[1] "5.2 Gb"

Since the data takes up about 5 GB of memory, we will work with a small (0.01%) random sample of it to keep the analysis efficient.

Code
sampled_df <- df |> slice_sample(prop = 0.0001)
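
Because the sample is random, every run produces a slightly different network. A minimal sketch of making the draw reproducible with set.seed() follows; the seed value 42 and the toy table are arbitrary assumptions, and the toy uses prop = 0.25 rather than the 0.0001 used above so that the row count is easy to check.

```r
library(dplyr)

# Toy table standing in for the full flight data (hypothetical)
toy_df <- tibble(id = 1:1000)

# Setting the same seed before each draw gives the same sample
set.seed(42)
sample_a <- toy_df |> slice_sample(prop = 0.25)

set.seed(42)
sample_b <- toy_df |> slice_sample(prop = 0.25)

print(identical(sample_a, sample_b))
print(nrow(sample_a))
```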

The dataset has many columns we don’t need, so we keep only the ones required for the analysis: origin, destination, latitude_1, longitude_1, latitude_2, and longitude_2. Here latitude_1 and longitude_1 belong to the origin, and latitude_2 and longitude_2 belong to the destination. We then drop any rows that have NA values in these columns.

Code
sampled_df <- sampled_df |> select(origin, destination, latitude_1, longitude_1, latitude_2, longitude_2) |> drop_na()
format(object.size(sampled_df), units = "Mb")
[1] "0.2 Mb"

Great, now that we have the data we need, we can finally start working on it. However, we can’t build a network from it directly; we first need to convert it into a vertex list and an edge list. We create the edge list first: each edge is an (origin, destination) pair, weighted by the number of sampled flights on that route. Because direction matters, this will be a directed graph.

Code
edges <- sampled_df |> group_by(origin, destination) |> summarise(weight = n())
`summarise()` has regrouped the output.
ℹ Summaries were computed grouped by origin and destination.
ℹ Output is grouped by origin.
ℹ Use `summarise(.groups = "drop_last")` to silence this message.
ℹ Use `summarise(.by = c(origin, destination))` for per-operation grouping
  (`?dplyr::dplyr_by`) instead.
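
As a small illustration of the edge-weight idea, the sketch below counts flights per (origin, destination) pair on a made-up table. It uses dplyr::count(), which is an alternative to the group_by() and summarise() pair above and returns an ungrouped result, so the regrouping message does not appear. The toy flights are hypothetical.

```r
library(dplyr)

# Hypothetical toy flights: KIAD -> KSEA appears twice
toy_flights <- tibble(
  origin      = c("KIAD", "KIAD", "KCLT", "KLAX"),
  destination = c("KSEA", "KSEA", "KIAD", "KDEN")
)

# One row per directed route; weight = number of flights on that route
toy_edges <- toy_flights |>
  count(origin, destination, name = "weight")

print(toy_edges)
```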

Next, we create the list of vertices, which is just the airports in the dataset. Since airports appear both as origins and as destinations, we first collect all origin nodes with their coordinates, then all destination nodes with theirs. Combining the two and keeping one row per airport gives the full list of nodes.

Code
origins <- sampled_df |>
  select(name = origin, lat = latitude_1, long = longitude_1)

destinations <- sampled_df |>
  select(name = destination, lat = latitude_2, long = longitude_2)

nodes <- bind_rows(origins, destinations) |>
  distinct(name, .keep_all = TRUE) |>
  na.omit()
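
One detail worth seeing on a small example: distinct(name, .keep_all = TRUE) keeps the first row it encounters for each airport, so when the same airport shows up with slightly different coordinates, the first pair is the one that survives. The coordinates below are made up for illustration.

```r
library(dplyr)

# Hypothetical coordinates; KIAD appears in both tables with different values
toy_origins <- tibble(name = c("KIAD", "KCLT"),
                      lat  = c(38.9, 35.2),
                      long = c(-77.5, -80.9))
toy_destinations <- tibble(name = c("KSEA", "KIAD"),
                           lat  = c(47.4, 39.0),
                           long = c(-122.3, -77.4))

# Stack both tables, then keep the first row seen for each airport
toy_nodes <- bind_rows(toy_origins, toy_destinations) |>
  distinct(name, .keep_all = TRUE)

print(toy_nodes)
```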

Now we have the nodes and the edges. To analyze them, we first build a graph from the two tables. We also remove all self-loops.

Code
flight_network <- graph_from_data_frame(d = edges, vertices = nodes, directed = TRUE)
flight_network <- simplify(flight_network, remove.multiple = TRUE, remove.loops = TRUE)
flight_network
IGRAPH f1c07f1 DNW- 1640 2048 -- 
+ attr: name (v/c), lat (v/n), long (v/n), weight (e/n)
+ edges from f1c07f1 (vertex names):
 [1] KIAD->KSEA KIAD->KTYS KIAD->KSFO KIAD->KBUF KIAD->KRIC KIAD->KEWR
 [7] KIAD->KBNA KIAD->MPTO KIAD->KAUS KIAD->KHPN KIAD->KHSV KIAD->KABE
[13] KIAD->PN64 KIAD->60CO KCLT->KIAD KCLT->KSEA KCLT->KDCA KCLT->KPHX
[19] KCLT->KSDF KCLT->KCMH KCLT->KDTW KCLT->KORF KCLT->KMDT KCLT->KEWR
[25] KCLT->KRSW KCLT->KECP KCLT->NC03 KCLT->19KY KCLT->KMLB KCLT->KBFM
[31] KCLT->KAKH KCLT->1WI6 KCLT->VA27 NZRO->NZTO KLAX->KDCA KLAX->KDEN
[37] KLAX->KLAS KLAX->KPHL KLAX->KPHX KLAX->KFLL KLAX->KABQ KLAX->KSFO
[43] KLAX->KBOS KLAX->KTUS KLAX->KSLC KLAX->YSSY KLAX->KJFK KLAX->KBNA
+ ... omitted several edges
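
To make the effect of the two calls concrete, here is a minimal sketch on a hypothetical three-airport edge list that contains one self-loop. The names A, B, and C and the weights are invented for illustration.

```r
library(igraph)

# Hypothetical toy edge list with one self-loop (A -> A)
toy_edges <- data.frame(
  from   = c("A", "A", "B"),
  to     = c("B", "A", "C"),
  weight = c(2, 1, 1)
)
toy_nodes <- data.frame(name = c("A", "B", "C"))

# Build a directed graph, then drop loops and merge any duplicate edges
g <- graph_from_data_frame(d = toy_edges, vertices = toy_nodes, directed = TRUE)
g_simple <- igraph::simplify(g, remove.multiple = TRUE, remove.loops = TRUE)

print(ecount(g))                  # edges before simplifying
print(ecount(g_simple))           # edges after the self-loop is dropped
print(any(which_loop(g_simple)))  # are any loops left?
```

In the flight data the edge list already has one row per (origin, destination) pair, so remove.multiple has nothing to merge there; the step mainly removes flights that start and end at the same airport.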

Basic Graph Properties

Code
cat("Number of airports (vertices):", vcount(flight_network), "\n")
Number of airports (vertices): 1640 
Code
cat("Number of flight routes (edges):", ecount(flight_network), "\n")
Number of flight routes (edges): 2048 
Code
cat("Is the graph simple?", is_simple(flight_network), "\n")
Is the graph simple? TRUE 
Code
cat("Is the graph connected (weakly)?", is_connected(flight_network, mode="weak"), "\n")
Is the graph connected (weakly)? FALSE 
Code
cat("Is the graph connected (strongly)?", is_connected(flight_network, mode="strong"), "\n")
Is the graph connected (strongly)? FALSE 
Code
cat("Diameter:", diameter(flight_network, weights=NA), "\n")
Diameter: 14 
Code
cat("Average path length:", mean_distance(flight_network), "\n")
Average path length: 5.457955 

Vertex and Edge Characteristics

Degree Distribution

Code
par(mfrow=c(1,2))
hist(igraph::degree(flight_network),
     col = "blue",
     xlim = c(0, max(igraph::degree(flight_network))),
     breaks = 50,
     xlab = "Vertex Degree",
     ylab = "Frequency",
     main = "Degree Distribution")

# Log-log degree distribution
dd.flights <- degree_distribution(flight_network)
max_d <- max(igraph::degree(flight_network))
d <- 0:max_d
ind <- (dd.flights != 0)
plot(d[ind], dd.flights[ind],
     log = "xy",
     col = "blue",
     xlab = "Log-Degree",
     ylab = "Log-Intensity",
     main = "Log-Log Degree Distribution")
Warning in xy.coords(x, y, xlabel, ylabel, log): 1 x value <= 0 omitted from
logarithmic plot

Vertex Strength

Code
par(mfrow=c(1,2))
hist(igraph::degree(flight_network), col="lightblue",
     xlab="Vertex Degree", ylab="Frequency", main="Degree")
hist(strength(flight_network), col="pink",
     xlab="Vertex Strength", ylab="Frequency", main="Strength")

Average Neighbor Degree

Code
a.nn.deg.flight <- knn(flight_network, V(flight_network))$knn
plot(igraph::degree(flight_network), a.nn.deg.flight,
     log = "xy",
     col = "blue",
     xlab = "Log Vertex Degree",
     ylab = "Log Average Neighbor Degree",
     main = "Degree vs Average Neighbor Degree")
Warning in xy.coords(x, y, xlabel, ylabel, log): 129 x values <= 0 omitted from
logarithmic plot

Vertex Centrality Measures

Code
# Top 10 airports by different centrality measures
deg <- igraph::degree(flight_network, mode="all")
cat("Top 10 by Degree:\n")
Top 10 by Degree:
Code
head(sort(deg, decreasing=TRUE), 10)
KORD KATL KLAS KLAX KDFW KDEN KPHX KSEA KMSP KCLT 
  59   58   52   47   46   45   39   37   34   33 
Code
cat("\nTop 10 by In-Degree:\n")

Top 10 by In-Degree:
Code
head(sort(igraph::degree(flight_network, mode="in"), decreasing=TRUE), 10)
KORD KATL KLAS KDFW KLAX KDEN KPHX KSEA KMSP KBNA 
  30   28   26   23   22   21   21   20   17   16 
Code
cat("\nTop 10 by Out-Degree:\n")

Top 10 by Out-Degree:
Code
head(sort(igraph::degree(flight_network, mode="out"), decreasing=TRUE), 10)
KATL KORD KLAS KLAX KDEN KDFW KCLT EGLL KPHX KSEA 
  30   29   26   25   24   23   19   18   18   17 
Code
# Closeness centrality (top 10)
cl <- igraph::closeness(flight_network, mode="all")
cat("Top 10 by Closeness Centrality:\n")
Top 10 by Closeness Centrality:
Code
head(sort(cl, decreasing=TRUE), 10)
NZRO XS86 19NK LFGC KDAW 2IS4 OMAD KAVQ MI58 KPVG 
   1    1    1    1    1    1    1    1    1    1 
Code
# Betweenness centrality (top 10)
btwn <- igraph::betweenness(flight_network)
cat("Top 10 by Betweenness Centrality:\n")
Top 10 by Betweenness Centrality:
Code
head(sort(btwn, decreasing=TRUE), 10)
    EGLL     KORD     KATL     KDFW     KLAS     KLAX     KDEN     LFPG 
42100.87 41425.43 37787.45 32238.81 30905.17 25903.58 25493.66 25313.44 
    KJFK     KMSP 
23295.37 21342.37 
Code
# Hub and authority scores
cat("Top 10 Hub Airports:\n")
Top 10 Hub Airports:
Code
hub_scores <- hits_scores(flight_network)$hub
Code
head(sort(hub_scores, decreasing=TRUE), 10)
     KORD      KDEN      KLAS      KSFO      KDFW      KLAX      KATL      KMIA 
1.0000000 0.7692480 0.7616491 0.7186017 0.6403326 0.6171922 0.6060057 0.4917910 
     KMCO      KSEA 
0.4795538 0.4725807 
Code
cat("\nTop 10 Authority Airports:\n")

Top 10 Authority Airports:
Code
auth_scores <- hits_scores(flight_network)$authority
Code
head(sort(auth_scores, decreasing=TRUE), 10)
     KPHX      KLAX      KLAS      KSEA      KDFW      KSAN      KATL      KORD 
1.0000000 0.8598761 0.8290710 0.7577671 0.6250533 0.6152005 0.5223832 0.4887282 
     KDEN      KSDF 
0.4694138 0.4203836 

Target Plot: Top 200 Flight Hubs

Code
# Four centrality target plots for top hubs
# Load sna and network here for gplot.target
library(network)

'network' 1.20.0 (2026-02-06), part of the Statnet Project
* 'news(package="network")' for changes since last version
* 'citation("network")' for citation information
* 'https://statnet.org' for help, support, and other information

Attaching package: 'network'
The following objects are masked from 'package:igraph':

    %c%, %s%, add.edges, add.vertices, delete.edges, delete.vertices,
    get.edge.attribute, get.edges, get.vertex.attribute, is.bipartite,
    is.directed, list.edge.attributes, list.vertex.attributes,
    set.edge.attribute, set.vertex.attribute
Code
library(sna)
Loading required package: statnet.common

Attaching package: 'statnet.common'
The following objects are masked from 'package:base':

    attr, order, replace
sna: Tools for Social Network Analysis
Version 2.8 created on 2024-09-07.
copyright (c) 2005, Carter T. Butts, University of California-Irvine
 For citation information, type citation("sna").
 Type help(package="sna") to get started.

Attaching package: 'sna'
The following objects are masked from 'package:igraph':

    betweenness, bonpow, closeness, components, degree, dyad.census,
    evcent, hierarchy, is.connected, neighborhood, triad.census
Code
n_top <- min(200, vcount(flight_network))
top_nodes <- order(igraph::degree(flight_network), decreasing=TRUE)[1:n_top]
hubs_subgraph <- induced_subgraph(flight_network, top_nodes)

A_sub <- as.matrix(as_adjacency_matrix(hubs_subgraph))
net_sub <- as.network.matrix(A_sub)

par(mfrow=c(2,2))

gplot.target(net_sub, sna::degree(net_sub),
    main="Degree", circ.lab = FALSE, circ.col="skyblue",
    usearrows = FALSE, edge.col="darkgray")

gplot.target(net_sub, sna::closeness(net_sub),
    main="Closeness", circ.lab = FALSE, circ.col="skyblue",
    usearrows = FALSE, edge.col="darkgray")

gplot.target(net_sub, sna::betweenness(net_sub),
    main="Betweenness", circ.lab = FALSE, circ.col="skyblue",
    usearrows = FALSE, edge.col="darkgray")

gplot.target(net_sub, sna::evcent(net_sub),
    main="Eigenvector", circ.lab = FALSE, circ.col="skyblue",
    usearrows = FALSE, edge.col="darkgray")

Edge Betweenness

Code
# Top edges by betweenness
eb <- igraph::edge_betweenness(flight_network)
top_edges <- E(flight_network)[order(eb, decreasing=TRUE)[1:10]]
cat("Top 10 edges by betweenness:\n")
Top 10 edges by betweenness:
Code
print(top_edges)
+ 10/2048 edges from f1c07f1 (vertex names):
 [1] KJFK->EGLL LSGG->EGLL LFPG->KORD EDDH->EDDK EGLL->KORD KFXE->KJQF
 [7] KDFW->LEMD EGKB->LSGG KJQF->KATL KMSP->LFPG

Network Cohesion

Clique Analysis

Code
# Clique census (on undirected version)
flight_undirected <- igraph::as_undirected(flight_network, mode="collapse")
Code
clique_sizes <- sapply(igraph::cliques(flight_undirected), length)
cat("Clique census:\n")
Clique census:
Code
table(clique_sizes)
clique_sizes
   1    2    3    4    5    6 
1640 1953  697  366  117   13 
Code
cat("\nClique number:", igraph::clique_num(flight_undirected), "\n")

Clique number: 6 

Density and Clustering

Code
cat("Graph density:", igraph::edge_density(flight_network), "\n")
Graph density: 0.0007619161 
Code
cat("Global transitivity:", igraph::transitivity(flight_undirected), "\n")
Global transitivity: 0.1118601 
Code
# Ego-centric density for top hub airports
top_airport <- V(flight_network)$name[which.max(igraph::degree(flight_network))]
ego.top <- igraph::induced_subgraph(flight_network,
    igraph::neighborhood(flight_network, 1, which.max(igraph::degree(flight_network)))[[1]])
cat("Overall density:", igraph::edge_density(flight_network), "\n")
Overall density: 0.0007619161 
Code
cat("Top hub (", top_airport, ") ego density:", igraph::edge_density(ego.top), "\n")
Top hub ( KORD ) ego density: 0.07162823 
Code
# Local clustering for top 5 airports by degree
top5 <- order(igraph::degree(flight_undirected), decreasing=TRUE)[1:5]
cat("Local transitivity for top 5 airports:\n")
Local transitivity for top 5 airports:
Code
local_cl <- igraph::transitivity(flight_undirected, "local", vids=top5)
names(local_cl) <- V(flight_undirected)$name[top5]
print(local_cl)
      KATL       KORD       KDFW       KLAX       KLAS 
0.06195286 0.08998549 0.10147992 0.11849391 0.12624585 

Connectivity

Code
# Component analysis
comps <- igraph::decompose(flight_network, mode="weak")
comp_sizes <- sapply(comps, igraph::vcount)
cat("Number of weakly connected components:", length(comps), "\n")
Number of weakly connected components: 373 
Code
cat("Component size distribution:\n")
Component size distribution:
Code
table(comp_sizes)
comp_sizes
  1   2   3   4   5   6  11 939 
129 188  38  11   3   2   1   1 
Code
# Giant component
flight.gc <- comps[[which.max(comp_sizes)]]
cat("\nGiant component:", igraph::vcount(flight.gc), "vertices,",
    igraph::ecount(flight.gc), "edges\n")

Giant component: 939 vertices, 1714 edges
Code
cat("Fraction of vertices in giant component:",
    igraph::vcount(flight.gc)/igraph::vcount(flight_network), "\n")
Fraction of vertices in giant component: 0.572561 
Code
# Small world properties of giant component
cat("Average path length:", igraph::mean_distance(flight.gc), "\n")
Average path length: 5.464584 
Code
cat("Diameter:", igraph::diameter(flight.gc, weights=NA), "\n")
Diameter: 14 
Code
flight.gc.undirected <- igraph::as_undirected(flight.gc, mode="collapse")
cat("Transitivity:", igraph::transitivity(flight.gc.undirected), "\n")
Transitivity: 0.1124859 
Code
# Vertex and edge connectivity of giant component
cat("Vertex connectivity:", igraph::vertex_connectivity(flight.gc), "\n")
Vertex connectivity: 0 
Code
cat("Edge connectivity:", igraph::edge_connectivity(flight.gc), "\n")
Edge connectivity: 0 
Code
# Articulation points (cut vertices)
cut.verts <- igraph::articulation_points(flight.gc)
cat("Number of articulation points:", length(cut.verts), "\n")
Number of articulation points: 310 
Code
cat("Fraction of vertices that are cut vertices:",
    length(cut.verts)/igraph::vcount(flight.gc), "\n")
Fraction of vertices that are cut vertices: 0.3301384 
Code
# Dyad census for the directed flight network
igraph::dyad_census(flight_network)
$mut
[1] 95

$asym
[1] 1858

$null
[1] 1342027
Code
# Reciprocity
cat("Reciprocity (default):", igraph::reciprocity(flight_network, mode="default"), "\n")
Reciprocity (default): 0.09277344 
Code
cat("Reciprocity (ratio):", igraph::reciprocity(flight_network, mode="ratio"), "\n")
Reciprocity (ratio): 0.04864311 
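
Both reciprocity values can be recovered by hand from the dyad census above, which is a useful sanity check. The sketch below redoes the arithmetic using only the counts printed in this section; the variable names are illustrative.

```r
# Counts copied from the dyad census and graph summary above
mut    <- 95    # mutual dyads (A -> B and B -> A both present)
asym   <- 1858  # asymmetric dyads (only one direction present)
n_vert <- 1640

# Each mutual dyad contributes two directed edges
n_edges <- 2 * mut + asym

# "default": fraction of directed edges that are reciprocated
recip_default <- 2 * mut / n_edges

# "ratio": mutual dyads over all connected dyads
recip_ratio <- mut / (mut + asym)

# Null dyads: all vertex pairs minus the connected ones
null_dyads <- choose(n_vert, 2) - (mut + asym)

print(c(edges = n_edges, default = recip_default,
        ratio = recip_ratio, null = null_dyads))
```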

K-Core Decomposition

Code
# K-core decomposition
cores <- igraph::coreness(flight_undirected)
cat("Max coreness:", max(cores), "\n")
Max coreness: 8 
Code
cat("Coreness distribution:\n")
Coreness distribution:
Code
table(cores)
cores
   0    1    2    3    4    5    6    7    8 
 129 1157  169   76   48   17    7   10   27 
Code
# K-core target plot for top hubs
A_sub <- as.matrix(igraph::as_adjacency_matrix(hubs_subgraph))
net_sub <- as.network.matrix(A_sub)
hubs_undirected <- igraph::as_undirected(hubs_subgraph, mode="collapse")
hub_cores <- igraph::coreness(hubs_undirected)
gplot.target(net_sub, hub_cores, circ.lab = FALSE,
    circ.col="skyblue", usearrows = FALSE,
    vertex.col=hub_cores, edge.col="darkgray",
    main="K-Core Decomposition (Top Hubs)")

Community Detection

Code
# Fast greedy community detection on the undirected network
fc <- igraph::cluster_fast_greedy(flight_undirected)
cat("Number of communities:", length(fc), "\n")
Number of communities: 434 
Code
cat("Community sizes:\n")
Community sizes:
Code
igraph::sizes(fc)
Community sizes
  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20 
259  54 221  63  34  25  19  16  11  11  11  10   9  10   9  10   6   7  11   6 
 21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36  37  38  39  40 
  6   6   5   5   7   5   5   6   5   5   6   6   5   5   5   4   4   4   4   5 
 41  42  43  44  45  46  47  48  49  50  51  52  53  54  55  56  57  58  59  60 
  4   4   4   4   4   5   3   4   3   3   3   3   4   4   3   4   4   4   3   3 
 61  62  63  64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79  80 
  4   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3 
 81  82  83  84  85  86  87  88  89  90  91  92  93  94  95  96  97  98  99 100 
  3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3 
101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 
  3   3   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2 
121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 
  2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2 
141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 
  2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2 
161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 
  2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2 
181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 
  2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2 
201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 
  2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2 
221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 
  2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2 
241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 
  2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2 
261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 
  2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2 
281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 
  2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2 
301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 
  2   2   2   2   2   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1 
321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 
  1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1 
341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 
  1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1 
361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 
  1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1 
381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 
  1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1 
401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 
  1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1 
421 422 423 424 425 426 427 428 429 430 431 432 433 434 
  1   1   1   1   1   1   1   1   1   1   1   1   1   1 
Code
# Modularity of the partition
cat("Modularity:", igraph::modularity(fc), "\n")
Modularity: 0.7013626 
Code
# Community membership of top airports
top10 <- order(igraph::degree(flight_undirected), decreasing=TRUE)[1:10]
cat("Community membership of top 10 airports:\n")
Community membership of top 10 airports:
Code
mem <- igraph::membership(fc)[top10]
names(mem) <- V(flight_undirected)$name[top10]
print(mem)
KATL KORD KDFW KLAX KLAS KDEN KSEA KPHX KCLT KMSP 
   1    1    1    1    1    1    1    1    1    1 
Code
# Plot communities on the network
plot(fc, flight_undirected, vertex.size=3, vertex.label=NA,
     main="Flight Network Communities")

Code
# Dendrogram
library(ape)

Attaching package: 'ape'
The following objects are masked from 'package:sna':

    consensus, degree
The following objects are masked from 'package:igraph':

    degree, edges, mst, ring
The following object is masked from 'package:dplyr':

    where
Code
igraph::plot_dendrogram(fc, mode="phylo", cex=0.3)
title("Flight Network Community Dendrogram")

Spectral Analysis

Code
# Graph Laplacian analysis on the undirected network
k.lap <- igraph::laplacian_matrix(flight_undirected)
eig.anal <- eigen(k.lap, symmetric=TRUE)

# Plot the 50 largest eigenvalues (eigen() returns them in decreasing order)
n_eig <- min(50, igraph::vcount(flight_undirected))
plot(eig.anal$values[1:n_eig], col="blue",
     ylab="Eigenvalues of Graph Laplacian",
     xlab="Index",
     main="Eigenvalue Spectrum (50 Largest)")

Code
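# Fiedler vector: eigenvector of the second-smallest Laplacian eigenvalue.
# eigen() sorts eigenvalues in decreasing order, so it is column n_v - 1.
# Caveat: the full network is disconnected, so this eigenvalue is 0 here.
n_v <- igraph::vcount(flight_undirected)
f.vec <- eig.anal$vectors[, n_v - 1]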

# Color by community membership for comparison
f.colors <- igraph::membership(fc)
plot(f.vec, pch=16, col=f.colors,
     xlab="Airport Index",
     ylab="Fiedler Vector Entry",
     main="Fiedler Vector (Spectral Bipartition)")
abline(0, 0, lwd=2, col="lightgray")
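
The bipartition idea is easier to see on a small connected graph. The sketch below, which uses a hypothetical toy graph of two triangles joined by a single bridge, shows the signs of the Fiedler vector separating the two triangles. It is an illustration of the technique, not a reanalysis of the flight network.

```r
library(igraph)

# Hypothetical toy graph: triangle A-B-C and triangle D-E-F, bridged by C-D
g <- graph_from_literal(A-B, B-C, A-C, C-D, D-E, E-F, D-F)

# Dense Laplacian and its eigendecomposition (eigenvalues in decreasing order)
lap <- laplacian_matrix(g, sparse = FALSE)
eig <- eigen(lap, symmetric = TRUE)

n <- vcount(g)
fiedler_value <- eig$values[n - 1]     # second-smallest eigenvalue
fiedler <- eig$vectors[, n - 1]        # its eigenvector
names(fiedler) <- V(g)$name

# The sign of each entry assigns the vertex to one side of the cut
print(fiedler_value)
print(sign(fiedler))
```

On a connected graph the second-smallest eigenvalue is strictly positive, which is why running the same analysis on the giant component (flight.gc.undirected) may give a more informative split than the full disconnected network.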

Assortativity

Code
# Degree assortativity
cat("Degree assortativity:", igraph::assortativity_degree(flight_network), "\n")
Degree assortativity: 0.2211966 

Network Visualization (ggraph)

Code
# Visualize top hub subnetwork using ggraph
ggraph(hubs_subgraph, layout = "fr") +
  geom_edge_link(alpha = 0.1, color = "gray50") +
  geom_node_point(aes(size = igraph::degree(hubs_subgraph)),
                  color = "steelblue", alpha = 0.7) +
  scale_size_continuous(range = c(1, 8), name = "Degree") +
  theme_void() +
  labs(title = "Top Flight Hub Network (Fruchterman-Reingold Layout)")